Nonresponse is common in surveys. When the response probability
of a survey variable Y depends on Y through an observed
auxiliary categorical variable Z (i.e., the response probability of Y is
conditionally independent of Y given Z), a simple method often used 
in practice is to use Z categories as imputation cells and construct 
estimators by imputing nonrespondents or reweighting respondents 
within each imputation cell. This simple method, however, is inef- 
ficient when some Z categories have small sizes and ad hoc meth- 
ods are often applied to collapse small imputation cells. Assuming a 
parametric model on the conditional probability of Z given Y and a 
nonparametric model on the distribution of Y, we develop a pseudo 
empirical likelihood method to provide more efficient survey estimators.
Our method avoids any ad hoc collapsing of small Z categories,
since reweighting or imputation is done across Z categories. Asymptotic
distributions for estimators of population means based on the
pseudo empirical likelihood method are derived. For variance esti- 
mation, we consider a bootstrap procedure and its consistency is 
established. Some simulation results are provided to assess the finite 
sample performance of the proposed estimators. 



1. Introduction. Nonresponse is a common phenomenon in sample surveys.
Let Y be a variable of interest in a survey. The probability of having
a nonrespondent in Y typically depends on the unobserved value of
Y, which creates a great challenge in the analysis of incomplete survey 
data. A common approach is to assume that the dependence of the nonresponse
probability on Y is through an auxiliary categorical variable Z
whose values are observed for all sampled units in the survey. More precisely,
$P(\delta = 1 \mid Y, Z) = P(\delta = 1 \mid Z)$, where $\delta$ is the response indicator for Y.
That is, conditional on Z, $\delta$ and Y are statistically independent. This response
mechanism is referred to as the unconfounded response mechanism
by Lee, Rancourt and Särndal (1994), which is the same as missing at random
(MAR) [Rubin (1976)]. In the 2002 Survey of Industrial Research and
Development (SIRD), for example, a variable Y with nonresponse can be 
one of the wage, fringe benefit, material or depreciation from companies un- 
der study, and a categorical covariate Z is the type of industry (see Table 
1). What the MAR assumption amounts to is that the nonresponse rate 
depends on the type of industry, but does not vary within a particular in- 
dustry. Although the MAR assumption may not always hold, it provides an 
acceptable approximation in many situations [Rubin (1976)] like the SIRD. 

Unbiased or approximately unbiased estimators of parameters such as 
the mean of Y and the mean of Y conditional on Z (e.g., the mean for a 
particular industry in the SIRD) can be constructed using incomplete Y 
values, observed Z values and the MAR assumption. In sample surveys, it 
is often desired to impute nonrespondents and then compute estimates by 
treating imputed values as observed data and using the standard formulas 
designed to produce unbiased or approximately unbiased survey estimators 
in the case of no nonresponse [Kalton and Kasprzyk (1986)]. 

Let $\{z_1, \ldots, z_s\}$ be the range for Z, where $s$ is a fixed positive integer.
A simple method that is often applied in practice is to use $z_1, \ldots, z_s$ as
"imputation cells" and construct estimators by imputing nonrespondents (or
reweighting respondents) within each imputation cell. Although this method
of using $z_1, \ldots, z_s$ as imputation cells is simple, it often runs into one or both
of the following problems:

1. In practice, imputation cells are not necessarily constructed using all cat- 
egories according to Z values, because some imputation cells may have 
small sizes. In some agencies, an internal rule is that in each imputa- 
tion cell, the number of respondents has to be larger than the number 
of nonrespondents. Cells with small sizes are often collapsed to achieve 
this goal. Table 1 displays the nonresponse rate for four variables in the 
SIRD. Although the overall nonresponse rate (the last line of Table 1) 
is between 36.8% and 41.1%, nonresponse rates in many industries are 
higher than 50%. If each imputation cell must have more respondents 
than nonrespondents, then some Z categories (industries) have to be col- 
lapsed. Collapsing cells not only is ad hoc and subjective, but also may 
violate the MAR assumption and create biased survey estimators. More 
precisely, let $\tilde Z$ be the categorical variable corresponding to the new imputation
cells. Then $\tilde Z$ is a function of Z and $P(\delta = 1 \mid Y, \tilde Z) = P(\delta = 1 \mid \tilde Z)$
may not hold.
Table 1

Nonresponse rates (%) for survey items by industry in 2002 SIRD

                                                                        Items in SIRD
Industry                                                    Wages  Fringe benefits  Materials  Depreciation

Food                                                         33.8             29.4       29.5          32.7
Beverage and tobacco products                                11.8              8.1        3.3           0.0
Textiles, apparel and leather                                27.3             17.0       30.6          57.0
Paper, printing and support activities                       15.5             50.0       17.7          20.1
Petroleum and coal products                                  83.7             82.3       82.5          90.0
Chemicals                                                    22.4             23.8       22.8          20.1
Plastic and rubber products                                  33.5             42.5       20.4          43.5
Nonmetallic mineral products                                 17.8             24.7       22.8          21.4
Primary metals                                                7.8             14.5       21.0           2.8
Fabricated metal products                                    25.2             12.4       35.3          31.6
Machinery                                                    35.6             38.3       37.4          30.6
Computers and peripheral equipment                           49.7             61.5       50.6          50.1
Communication equipment                                      64.5             24.7       77.4          23.0
Semiconducting and other electronic components               14.7             10.6       14.4          12.5
Navigation, measuring, electronic and control instruments    56.2             62.2       56.6          58.6
Other computer and electronic products                       36.1             37.0       38.4          21.1
Electrical equipment, appliances and components              15.8              8.2       16.8          12.3
Motor vehicles and parts                                     47.2             55.6       41.2          51.0
Aerospace products and parts                                 22.7             26.8       21.8          18.9
Other transportation equipment                               39.3             52.5       26.3          16.4
Furniture and related products                               30.9             36.4       38.1          52.1
Miscellaneous manufacturing                                  57.7             53.5       68.6          41.7
Mining, extraction and support activities                    86.6             91.8       94.4          93.1
Construction                                                 33.2             54.3        8.7          57.3
Trade                                                        32.2             39.4       46.0          30.6
Publishing                                                   55.5             58.4       63.3          71.9
Broadcasting and telecommunications                          84.0             66.8       74.3          78.2
Other information                                             5.5             20.1       12.9          49.9
Finance, insurance and real estate                           29.8             40.9       55.5          45.3
Architectural, engineering and related services              32.1             26.1       38.1          67.4
Computer systems design and related services                 43.3             40.0       39.1          45.0
Scientific R&D services                                      28.8             29.0       29.2          31.6
Other professional, scientific and technical services        40.2             33.8       39.0          46.1
Health care services                                         12.7             12.5       14.4          35.6
Other nonmanufacturing                                       23.2             36.8       59.1          18.0
All industries                                               38.1             41.1       39.3          36.8

Source: National Science Foundation/Division of Science Resources Statistics, Survey of
Industrial Research and Development, 2002.

2. Although unbiasedness of survey estimators is the primary concern in
the development of an estimation and/or imputation procedure, the efficiency
of survey estimators should also be considered, especially when
auxiliary data are available. Let Z be the indicator for several data sets
(e.g., data sets from several different years). Suppose that we need to
estimate $E(Y \mid Z = z_1)$. If data from different years ($Z = z_j$, $j > 1$) also
carry information about $E(Y \mid Z = z_1)$, then the estimation efficiency can
be improved if we use all data sets, not just the single data set with
$Z = z_1$. The question is how to make use of different data sets. Simply
pooling different data sets together may introduce some estimation bias,
since each data set may have its own population distribution for Y.

The purpose of this paper is to study a pseudo empirical likelihood method
for estimation and imputation under the MAR assumption and a nonparametric
marginal distribution assumption for Y (which is particularly desired
for survey data). The empirical likelihood method was developed by Owen
(1988) and Qin and Lawless (1994) in the context of independent and identically
distributed data and was extended to survey problems (without missing
data) by Chen and Qin (1993), Chen and Sitter (1999), Zhong and Rao
(2000) and Wu and Rao (2006). When missing data are present, Wang and
Rao (2002) and Wang, Linton and Härdle (2004) considered the approach
of first imputing missing data based on some method and then applying
the empirical likelihood to imputed data to obtain more efficient estimators.
Since imputation has to be carried out first, this approach does not
deal with the problem of small size imputation cells. Assuming a parametric
model on $P(\delta = 1 \mid Y, Z)$ (but allowing the response probability to
depend on Y), Qin, Leung and Shao (2002) considered estimation with empirical
likelihoods putting positive mass on observed (Y, Z) only. Chen and Qin
(2006) used a similar approach for a binary Y. However, the problem of small
size imputation cells was not considered. Furthermore, none of these cited
papers contains an imputation procedure using the empirical likelihood approach.
We derive empirical likelihood estimators of $E(Y)$ and $E(Y \mid Z = z_j)$
and some imputation procedures that do not involve any ad hoc method of
forming imputation cells and provide more efficient estimators than those
from the simple approach of using $z_1, \ldots, z_s$ as imputation cells, at the price
of assuming a parametric model for $P(Z = z_j \mid Y)$. To make use of several
data sets or utilize all Z categories for imputation and estimation in a small
Z category, some model assumption that relates different data sets or categories
together is necessary. In survey problems with a continuous variable
Y, finding a suitable parametric model for $P(Z = z_j \mid Y)$ is much easier than
finding an appropriate parametric model for the conditional distribution of
Y given $Z = z_j$.
In our empirical likelihood, there are many parameters when $s$ (the number
of Z categories) is large, which creates problems in numerical computation
of the solution to the likelihood equation. We adopt a pseudo empirical
likelihood approach by replacing some nuisance parameters in the
likelihood equation with some simple consistent estimators. The resulting
estimators may lose some efficiency, but their computation is much more practical.
Theoretical properties of the pseudo empirical likelihood estimators
are investigated.

Section 2 presents details on the sampling design and model, and results 
for estimation without imputation. In addition to the derivation of pseudo 
empirical likelihood estimators, their consistency and asymptotic normality 
are established. Section 3 considers variance estimation by bootstrapping. 
In Section 4, we consider several imputation methods related to the results 
in Section 2. Asymptotic properties of estimators based on imputed data 
are given. Section 5 examines by simulation the finite sample performance 
of the proposed estimators, under some response patterns and models. The 
proofs are sketched in the Appendix. 

2. Pseudo empirical likelihood. Let $\mathcal{P}$ be a finite population stratified
into $H$ strata with $N_h$ units in the $h$th stratum. Assume that $n_h > 2$ units
are sampled from stratum $h$ according to some probability sampling plan,
independently across the strata. When equal probability sampling is used,
sampling is either without replacement or with replacement; when unequal
probability sampling is applied, we assume that sampling is with replacement,
since unequal probability sampling without replacement is not often
used because of its complexity and the difficulty in deriving variances of
estimators due to the dependence caused by without-replacement sampling
[Särndal, Swensson and Wretman (1992), Section 3.6]. According to the sampling
plan, survey weights $w_{hi}$, $i = 1, \ldots, n_h$, $h = 1, \ldots, H$, are constructed
so that for any set of values $\{x_{hi}\}$,

$$E_s\left(\sum_{h=1}^{H}\sum_{i=1}^{n_h} w_{hi} x_{hi}\right) = \frac{1}{N}\sum_{h=1}^{H}\sum_{i=1}^{N_h} x_{hi},$$

where $E_s$ is the expectation with respect to sampling and $N = \sum_{h=1}^{H} N_h$.
If $q_{hi}$ is the probability that the $i$th unit in stratum $h$ is in the sample,
then the survey weight $w_{hi} = (N q_{hi})^{-1}$. We consider the asymptotic setting
with a fixed $H$, $n_h \to \infty$, and $n_h/N_h \to 0$ for all $h$. This sampling design
is commonly used in many business surveys; for example, the Current Employment
Survey conducted by the U.S. Bureau of Labor Statistics [Wolter,
Shao and Huff (1998)], the Transportation Annual Survey conducted by the
U.S. Census Bureau [Census Bureau (1987)] and the Financial Farm Survey
conducted by Statistics Canada [Caron (1996)]. In the SIRD discussed in
Section 1, strata are created according to industry group and size of companies
and, within each stratum, either simple random sampling or probability
(proportionate to company size) sampling is used.
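
As a small illustration of how the survey weights act, the following sketch computes $w_{hi} = (N q_{hi})^{-1}$ and the weighted sum $\sum w_{hi} x_{hi}$ for one made-up stratum. The function names, the population size and the inclusion probabilities are hypothetical and chosen only for this example.

```python
def survey_weights(inclusion_probs, population_size):
    """Return w_hi = 1 / (N * q_hi) for each sampled unit."""
    return [1.0 / (population_size * q) for q in inclusion_probs]


def weighted_sum(weights, values):
    """Sum of w_hi * x_hi, which targets the population mean of x."""
    return sum(w * x for w, x in zip(weights, values))


# Hypothetical population of N = 100 units and an equal-probability
# sample of 5 units, each with inclusion probability q = 0.05.
N = 100
q = [0.05] * 5
x = [2.0, 4.0, 6.0, 8.0, 10.0]

w = survey_weights(q, N)
estimate = weighted_sum(w, x)
print(estimate)
```

With equal inclusion probabilities the weights are all equal, so the weighted sum reduces to the ordinary sample mean of the five values.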

Let Y be a variable of interest in the survey and Z be a categorical
covariate taking values in $\{z_1, \ldots, z_s\}$. Within stratum $h$, we assume that
(Y, Z) is random and follows a superpopulation model with an unknown
nonparametric marginal distribution $F_h$ for Y and a parametric probability
function

$$P_h(Z = z \mid Y = y) = f_h(y, z, \beta), \tag{1}$$

where $\beta$ is an unknown parameter vector and $f_h$ is a known function. For
each sampled unit, the Z value is always observed, but the Y value may be
a nonrespondent. Under the MAR assumption described in Section 1,
$P(\delta = 1 \mid Y, Z) = \phi_h(Z)$ in stratum $h$, where $\phi_h$ is an unknown function. Because of
the MAR assumption, we do not need to impose any condition on $\phi_h$ except
that $\phi_h(z_j) > 0$ for any $h$ and $z_j$. Without loss of generality, we assume that
in stratum $h$, the first $r_h$ sampled units are respondents and the remaining
$n_h - r_h$ sampled units are nonrespondents. Thus, the observed data set is

$$\{(Y_{hi}, Z_{hi}), i = 1, \ldots, r_h\} \cup \{Z_{hi}, i = r_h + 1, \ldots, n_h\}, \qquad h = 1, \ldots, H.$$

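Let $p_{hi} = dF_h(Y_{hi})$ be the point mass $F_h$ places on $Y_{hi}$. For a particular
unit $(h, i)$, if $i \le r_h$ ($Y_{hi}$ is observed), the likelihood is the joint probability
density

$$\phi_h(Z_{hi}) f_h(Y_{hi}, Z_{hi}, \beta)\, p_{hi};$$

if $i > r_h$ ($Y_{hi}$ is missing), the likelihood is the joint probability density with
$Y_{hi}$ integrated out, that is,

$$\int [1 - \phi_h(Z_{hi})] f_h(y, Z_{hi}, \beta)\, dF_h(y) = [1 - \phi_h(Z_{hi})] \int f_h(y, Z_{hi}, \beta)\, dF_h(y).$$

Following the idea in Chen and Sitter (1999), we weight each unit log-likelihood
by $w_{hi}$ and obtain the log-likelihood

$$\sum_{h=1}^{H}\left[\sum_{i=1}^{r_h} w_{hi} \log\bigl(\phi_h(Z_{hi}) f_h(Y_{hi}, Z_{hi}, \beta)\, p_{hi}\bigr) + \sum_{i=r_h+1}^{n_h} w_{hi} \log\left([1 - \phi_h(Z_{hi})] \int f_h(y, Z_{hi}, \beta)\, dF_h(y)\right)\right].$$

Adding the weights $w_{hi}$ is necessary for obtaining approximately unbiased
estimators under unequal probability sampling. Since $\phi_h(Z_{hi})$ does not involve
$\beta$ and $F_h$, we may focus on

$$L = \sum_{h=1}^{H}\left[\sum_{i=1}^{r_h} w_{hi} \log\bigl(f_h(Y_{hi}, Z_{hi}, \beta)\, p_{hi}\bigr) + \sum_{i=r_h+1}^{n_h} w_{hi} \log \int f_h(y, Z_{hi}, \beta)\, dF_h(y)\right]$$
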
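for the estimation of parameters related to Y. Within stratum $h$, let

$$\pi_{hj} = P_h(Z = z_j) = \int f_h(y, z_j, \beta)\, dF_h(y).$$

Since Z takes values $z_1, \ldots, z_s$, $L$ can be written as

$$L = \sum_{h=1}^{H}\left[\sum_{i=1}^{r_h} w_{hi} \log\bigl(f_h(Y_{hi}, Z_{hi}, \beta)\, p_{hi}\bigr) + \sum_{j=1}^{s} a_{hj} \log(\pi_{hj})\right],$$

where $a_{hj} = \sum_{i=r_h+1}^{n_h} w_{hi} I_{\{Z_{hi} = z_j\}}$ and $I_A$ is the indicator function of the
event $A$. Applying the empirical likelihood approach, we estimate $\beta$ and $F_h$,
$h = 1, \ldots, H$, by maximizing $L$ subject to

$$p_{hi} > 0, \qquad \sum_{i=1}^{r_h} p_{hi} = 1, \qquad \sum_{i=1}^{r_h} p_{hi} f_h(Y_{hi}, z_j, \beta) = \pi_{hj}, \qquad j = 1, \ldots, s,\ h = 1, \ldots, H. \tag{2}$$

Since $F_h$ is nonparametric, its estimate is an empirical distribution with
$r_h$ points, the observed $Y_{hi}$, $i = 1, \ldots, r_h$, as the support. Although $Y_{hi}$,
$i = 1, \ldots, r_h$, are from the distribution of respondents, we can obtain a valid
estimator of the marginal distribution $F_h$ using the covariate information
through the terms $\sum_j a_{hj} \log(\pi_{hj})$ in $L$. Using Lagrange multipliers under
the constraints in (2) and the usual profile empirical likelihood argument,
we can derive that

$$p_{hi} = \frac{w_{hi}}{\sum_{k=1}^{n_h} w_{hk} - \sum_{j=1}^{s} \frac{a_{hj}}{\pi_{hj}} f_h(Y_{hi}, z_j, \beta)}, \qquad i = 1, \ldots, r_h,\ h = 1, \ldots, H, \tag{3}$$

and obtain estimators of $\beta$ and $\pi = (\pi_{hj}, j = 1, \ldots, s, h = 1, \ldots, H)$ by maximizing

$$\ell(\beta, \pi) = \sum_{h=1}^{H}\left\{\sum_{i=1}^{r_h} w_{hi} \log\left[\frac{w_{hi} f_h(Y_{hi}, Z_{hi}, \beta)}{\sum_{k=1}^{n_h} w_{hk} - \sum_{j=1}^{s} \frac{a_{hj}}{\pi_{hj}} f_h(Y_{hi}, z_j, \beta)}\right] + \sum_{j=1}^{s} a_{hj} \log(\pi_{hj})\right\}$$

subject to

$$\sum_{i=1}^{r_h} \frac{w_{hi}\,[f_h(Y_{hi}, z_j, \beta) - \pi_{hj}]}{\sum_{k=1}^{n_h} w_{hk} - \sum_{l=1}^{s} \frac{a_{hl}}{\pi_{hl}} f_h(Y_{hi}, z_l, \beta)} = 0, \qquad j = 1, \ldots, s,\ h = 1, \ldots, H. \tag{4}$$
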
When $s$ (the number of Z categories) is not small, it is difficult to maximize
$\ell(\beta, \pi)$ over $(\beta, \pi)$ subject to (4). Numerical solutions may be very
computation-intensive to obtain and they may be unreliable. Hence, we apply
the idea of pseudo likelihood [Gong and Samaniego (1981)]. Note that
consistent estimators of the $\pi_{hj}$ are easy to construct. For example, we may
estimate $\pi_{hj}$ by

$$\hat\pi_{hj} = \sum_{i=1}^{n_h} w_{hi} I_{\{Z_{hi} = z_j\}} \Big/ \sum_{i=1}^{n_h} w_{hi}. \tag{5}$$

Maximizing the pseudo empirical likelihood $\ell(\beta, \hat\pi)$ over $\beta$ results in the
maximum pseudo empirical likelihood estimator (MPELE) $\hat\beta$, where
$\hat\pi = (\hat\pi_{hj}, j = 1, \ldots, s, h = 1, \ldots, H)$. The MPELE $\hat\beta$ can be computed by maximizing
$\ell(\beta, \hat\pi)$ over $\beta$ using any available software. For example, in the simulation
study in Section 5, $\hat\beta$ was obtained using FMINSEARCH in MATLAB.

Note that the MPELE is different from the maximum empirical likelihood
estimator since $\hat\pi$ is not $\pi$. The left-hand side of (4) is not 0 when $\pi$ is replaced
by $\hat\pi$ and $\beta$ is replaced by the MPELE $\hat\beta$, although we show later that it
converges to 0 in probability. However, similarly to other cases in which
the pseudo likelihood is used, we can directly establish the consistency and
asymptotic normality of the MPELE.
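
The computation can be sketched for a single stratum as follows. This is only a sketch under stated assumptions: the two-category logistic form of `f`, the toy data and the grid search (used here in place of a general-purpose optimizer such as FMINSEARCH) are all hypothetical and are not taken from the simulation study.

```python
import math

CATEGORIES = (0, 1)  # hypothetical two-category Z


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def f(y, z, beta):
    """Assumed model for P(Z = z | Y = y): logistic in (y - 3), illustration only."""
    p1 = expit(beta * (y - 3.0))
    return p1 if z == 1 else 1.0 - p1


def pi_hat(w, z):
    """Equation (5): weighted share of each Z category among all sampled units."""
    total = sum(w)
    return {c: sum(wi for wi, zi in zip(w, z) if zi == c) / total for c in CATEGORIES}


def denominators(beta, w, y_resp, a, pi):
    """Denominator of equation (3) for every respondent."""
    total = sum(w)
    return [total - sum(a[c] / pi[c] * f(yi, c, beta) for c in CATEGORIES)
            for yi in y_resp]


def pseudo_loglik(beta, w, y_resp, z, a, pi):
    """Pseudo empirical log-likelihood l(beta, pi_hat) for one stratum."""
    d = denominators(beta, w, y_resp, a, pi)
    if min(d) <= 0.0:
        return -math.inf
    ll = sum(w[i] * math.log(w[i] * f(y, z[i], beta) / d[i])
             for i, y in enumerate(y_resp))
    return ll + sum(a[c] * math.log(pi[c]) for c in CATEGORIES if a[c] > 0)


# Toy stratum: respondents come first, as in the text.
y_resp = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
z = [0, 0, 0, 1, 0, 1, 1, 1] + [0, 1, 1, 1]  # last four units are nonrespondents
w = [1.0] * len(z)                            # equal survey weights
r = len(y_resp)

pi = pi_hat(w, z)
a = {c: sum(w[i] for i in range(r, len(z)) if z[i] == c) for c in CATEGORIES}

# A coarse grid search stands in for a general-purpose optimizer.
grid = [k / 100.0 for k in range(-500, 501)]
beta_hat = max(grid, key=lambda b: pseudo_loglik(b, w, y_resp, z, a, pi))

d_hat = denominators(beta_hat, w, y_resp, a, pi)
p_hat = [w[i] / d_hat[i] for i in range(r)]  # equation (3) at (beta_hat, pi_hat)
print(beta_hat, sum(p_hat))
```

The sum of the estimated point masses is generally close to, but not exactly, 1, because $\hat\pi$ is plugged in rather than estimated jointly with $\beta$.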

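Let $\hat p_{hi}$ be obtained by using (3) with $\beta$ and $\pi_{hj}$ replaced by $\hat\beta$ and $\hat\pi_{hj}$,
respectively. The distribution function for Y can be estimated by

$$\hat G(y) = \sum_{h=1}^{H} W_h \sum_{i=1}^{r_h} \hat p_{hi} I_{\{Y_{hi} \le y\}},$$

where $W_h = N_h/N$, $h = 1, \ldots, H$. However, because the MPELE is used,
$\sum_{i=1}^{r_h} \hat p_{hi} \ne 1$ (although $\sum_{i=1}^{r_h} \hat p_{hi} \to_p 1$) and, hence, $\hat G(y)$ is not a distribution
function. A modified distribution estimator for Y is

$$\hat F(y) = \sum_{h=1}^{H} W_h \sum_{i=1}^{r_h} \hat p_{hi} I_{\{Y_{hi} \le y\}} \Big/ \sum_{h=1}^{H} W_h \sum_{i=1}^{r_h} \hat p_{hi}
= \sum_{h=1}^{H}\sum_{i=1}^{r_h} \tilde p_{hi} I_{\{Y_{hi} \le y\}} \Big/ \sum_{h=1}^{H}\sum_{i=1}^{r_h} \tilde p_{hi},$$

where $\tilde p_{hi} = W_h \hat p_{hi}$. If the parameter of interest is the finite population mean
$\bar Y = \sum_{h=1}^{H}\sum_{i=1}^{N_h} Y_{hi}/N$, its MPELE is

$$\hat{\bar Y} = \int y\, d\hat F(y) = \sum_{h=1}^{H}\sum_{i=1}^{r_h} \tilde p_{hi} Y_{hi} \Big/ \sum_{h=1}^{H}\sum_{i=1}^{r_h} \tilde p_{hi}. \tag{6}$$
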
Given $Z = z_j$, the conditional distribution of Y can be estimated by

$$\hat F_j(y) = \sum_{h=1}^{H}\sum_{i=1}^{r_h} \tilde p_{hi} f_h(Y_{hi}, z_j, \hat\beta) I_{\{Y_{hi} \le y\}} \Big/ \sum_{h=1}^{H}\sum_{i=1}^{r_h} \tilde p_{hi} f_h(Y_{hi}, z_j, \hat\beta).$$

If the parameter of interest is the cell mean $\bar Y_j$, the finite population mean
of Y given $Z = z_j$, its MPELE is

$$\hat{\bar Y}_j = \int y\, d\hat F_j(y) = \sum_{h=1}^{H}\sum_{i=1}^{r_h} \tilde p_{hi} f_h(Y_{hi}, z_j, \hat\beta) Y_{hi} \Big/ \sum_{h=1}^{H}\sum_{i=1}^{r_h} \tilde p_{hi} f_h(Y_{hi}, z_j, \hat\beta). \tag{7}$$
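
Once the reweighted point masses are available, the estimators (6) and (7) are plain ratio-type weighted means. A minimal sketch, with made-up inputs pooled over strata (the lists `p_tilde`, `y` and `f_j` are hypothetical values, not output of a fitted model):

```python
def mpele_mean(p_tilde, y):
    """Equation (6): normalized weighted mean over all respondents."""
    return sum(p * yi for p, yi in zip(p_tilde, y)) / sum(p_tilde)


def mpele_cell_mean(p_tilde, y, f_j):
    """Equation (7): each respondent is reweighted by f_h(Y_hi, z_j, beta_hat)."""
    num = sum(p * fj * yi for p, fj, yi in zip(p_tilde, f_j, y))
    den = sum(p * fj for p, fj in zip(p_tilde, f_j))
    return num / den


# Hypothetical inputs: p_tilde = W_h * p_hat for each respondent.
p_tilde = [0.2, 0.3, 0.5]
y = [1.0, 2.0, 3.0]
f_j = [0.5, 0.25, 0.1]  # assumed values of f_h(Y_hi, z_j, beta_hat)

overall = mpele_mean(p_tilde, y)
cell_j = mpele_cell_mean(p_tilde, y, f_j)
print(overall, cell_j)
```

Every respondent contributes to the cell mean, with respondents that are more likely to belong to cell $z_j$ under the assumed model receiving more weight.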

The following result shows that the MPELE $\hat\beta$, $\hat{\bar Y}$ and $\hat{\bar Y}_j$ are consistent
estimators and are asymptotically normal. The proofs are given in the
Appendix.

Theorem 1. Assume MAR as described in Section 1 and model (1).
Suppose that regularity conditions (i)-(v) stated in the Appendix hold. Then, there

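exists a sequence $\{\hat\beta_n, n = 1, 2, \ldots\}$ such that $\|\hat\beta_n - \beta_0\| < n^{-1/3}$ and as
$n \to \infty$,

$$P\left(\frac{\partial \ell(\hat\beta_n, \hat\pi)}{\partial \beta} = 0\right) \to 1 \quad\text{and}\quad \sqrt{n}\,(\hat\beta_n - \beta_0) \to_d N(0, A), \tag{8}$$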

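where $A$ is a positive definite matrix. Furthermore, if condition (iv) in the
Appendix holds, then

$$\sqrt{n}\,(\hat{\bar Y} - \bar Y) \to_d N(0, \sigma^2) \quad\text{and}\quad \sqrt{n}\,(\hat{\bar Y}_j - \bar Y_j) \to_d N(0, \sigma_j^2), \qquad j = 1, \ldots, s, \tag{9}$$

where $\sigma^2$ [given by (20)-(21) in the Appendix] and $\sigma_j^2$ are some positive
constants.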

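The simple method of reweighting respondents within each imputation
cell (see Section 1) produces the following estimators of $\bar Y$ and $\bar Y_j$:

$$\tilde{\bar Y} = \sum_{j=1}^{s}\sum_{h=1}^{H}\sum_{i=1}^{n_h} w_{hi} I_{\{Z_{hi} = z_j\}} \bar Y_{hj} \Big/ \sum_{h=1}^{H}\sum_{i=1}^{n_h} w_{hi} \tag{10}$$

and

$$\tilde{\bar Y}_j = \sum_{h=1}^{H}\sum_{i=1}^{n_h} w_{hi} I_{\{Z_{hi} = z_j\}} \bar Y_{hj} \Big/ \sum_{h=1}^{H}\sum_{i=1}^{n_h} w_{hi} I_{\{Z_{hi} = z_j\}}, \tag{11}$$

where

$$\bar Y_{hj} = \sum_{i=1}^{r_h} w_{hi} I_{\{Z_{hi} = z_j\}} Y_{hi} \Big/ \sum_{i=1}^{r_h} w_{hi} I_{\{Z_{hi} = z_j\}}. \tag{12}$$

Some comparisons of these estimators with the MPELE are made in a simulation
study (Section 5).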

3. Variance estimation by bootstrapping. It is a common practice in
sample surveys to report a variance estimate for each estimate of the parameter
of interest. We focus on the most commonly used estimators, the
mean estimator $\hat{\bar Y}$ in (6) and the cell mean estimator $\hat{\bar Y}_j$ in (7). Because
both $\hat{\bar Y}$ and $\hat{\bar Y}_j$ are complex functions of $\hat\beta$ and $\hat p_{hi}$, it is difficult to derive an
analytic form of their asymptotic variances, $\sigma^2$ and $\sigma_j^2$ in (9). It is shown in
the Appendix that $\sigma^2$ is equal to the limit of $\sum_{h=1}^{H} n N_h^2 \sigma_h^2/(n_h N^2)$, where
$\sigma_h^2$ is given in (21) in the Appendix and has a complicated form. Thus, we
apply the bootstrap method, which consists of the following steps. In the
following, $\hat\theta$ denotes $\hat\beta$, $\hat{\bar Y}$ or $\hat{\bar Y}_j$.

1. Within stratum $h$, draw a simple random sample of size $n_h$ with replacement
from the set of sampled units (respondents or nonrespondents).
Carry out this procedure independently across strata. For each unit in
the bootstrap sample, the bootstrap data are the Z and Y values (if the
Y is missing, the bootstrap datum is treated as missing) and its survey
weight.

2. Compute $\hat\theta^*$, which is the same as $\hat\theta$ but with the original data replaced
by the bootstrap data generated in step 1.

3. Repeat the previous steps independently $B$ times and obtain $\hat\theta^{*1}, \ldots, \hat\theta^{*B}$.
Estimate the variance of $\hat\theta$ by the sample variance of $\hat\theta^{*1}, \ldots, \hat\theta^{*B}$.
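
The three steps above can be sketched as follows. The estimator used here, a weighted mean of the observed Y values, is only a stand-in chosen to keep the example short; in practice `estimator` would recompute the MPELE on each bootstrap data set. The data layout (tuples of Y, Z and weight, with `None` for a missing Y) and all names are assumptions made for this illustration.

```python
import random
import statistics


def weighted_respondent_mean(strata):
    """Stand-in estimator: weighted mean of observed Y (None marks a nonrespondent)."""
    num = den = 0.0
    for units in strata:
        for y, _z, w in units:
            if y is not None:
                num += w * y
                den += w
    return num / den


def bootstrap_variance(strata, estimator, n_boot, rng):
    """Steps 1-3: resample n_h units with replacement within each stratum,
    recompute the estimator, and take the sample variance of the replicates."""
    replicates = []
    for _ in range(n_boot):
        boot = [[rng.choice(units) for _ in units] for units in strata]
        replicates.append(estimator(boot))
    return statistics.variance(replicates), replicates


# Toy data: two strata of (Y, Z, weight); None marks a missing Y.
rng = random.Random(0)
strata = [
    [(float(i), i % 2, 1.0) for i in range(8)] + [(None, 0, 1.0), (None, 1, 1.0)],
    [(10.0 + i, i % 2, 2.0) for i in range(8)] + [(None, 0, 2.0), (None, 1, 2.0)],
]
v_boot, reps = bootstrap_variance(strata, weighted_respondent_mean, 200, rng)
print(v_boot)
```

Nonrespondents are resampled together with respondents, so the number of respondents varies from one bootstrap sample to the next, as step 1 requires.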

If there is no nonresponse, then the previously described bootstrap produces 
consistent variance estimators for mean estimators [see, e.g., Shao and Tu 
(1995), Chapter 6]. However, no theory is available for the bootstrap when 
empirical likelihoods are used for nonrespondents. We establish the following 
result for the asymptotic validity of the bootstrap. 

Theorem 2. Assume the conditions in Theorem 1. Let $\ell^*(\cdot, \cdot)$ be the
bootstrap analog of $\ell(\cdot, \cdot)$ and $\hat\pi^* = (\hat\pi^*_{hj}, j = 1, \ldots, s, h = 1, \ldots, H)$, where $\hat\pi^*_{hj}$
is the bootstrap analog of $\hat\pi_{hj}$ in (5). Then, there exists a sequence $\{\hat\beta^*_n, n =
1, 2, \ldots\}$ such that $\|\hat\beta^*_n - \hat\beta\| < n^{-1/3}$ and as $n \to \infty$,



(13) 
where $A$ is given in (8), $P_*$ denotes the bootstrap probability conditional on
the data, and $\vartheta^*_n \to_{d^*} \vartheta$ means $P_*(\vartheta^*_n \in B) - P(\vartheta \in B) \to_p 0$ for any Borel
set $B$. Furthermore, if condition (iv) of Theorem 1 also holds, then

$$\sqrt{n}\,(\hat{\bar Y}^* - \hat{\bar Y}) \to_{d^*} N(0, \sigma^2) \quad\text{and}\quad \sqrt{n}\,(\hat{\bar Y}^*_j - \hat{\bar Y}_j) \to_{d^*} N(0, \sigma_j^2),$$

where $\sigma^2$ and $\sigma_j^2$ are defined in (9).

4. Imputation. Imputation is often carried out for practical reasons [Kalton
and Kasprzyk (1986)]. After imputation, estimates of parameters are computed
by treating imputed values as observed data and using the standard
formulas for the case of no nonresponse. In this section we consider imputation
for the estimation of the population mean $\bar Y$ and the population cell
mean $\bar Y_j$. Let $\tilde Y_{hi} = Y_{hi}$ if $Y_{hi}$ is a respondent and let $\tilde Y_{hi}$ be an imputed value
if $Y_{hi}$ is a nonrespondent. After imputation, the population mean $\bar Y$ and cell
mean $\bar Y_j$ are estimated by

$$\hat{\bar Y}_I = \sum_{h=1}^{H}\sum_{i=1}^{n_h} w_{hi} \tilde Y_{hi} \tag{14}$$

and

$$\hat{\bar Y}_{Ij} = \sum_{h=1}^{H}\sum_{i=1}^{n_h} w_{hi} I_{\{Z_{hi} = z_j\}} \tilde Y_{hi} \Big/ \sum_{h=1}^{H}\sum_{i=1}^{n_h} w_{hi} I_{\{Z_{hi} = z_j\}}, \tag{15}$$

respectively. The simple method of using $z_1, \ldots, z_s$ as imputation cells imputes
nonrespondents in an imputation cell using respondents in the same
cell only. The simple mean imputation method imputes each nonrespondent
in stratum $h$ with $Z = z_j$ by the cell sample mean $\bar Y_{hj}$ given in (12).
The simple random imputation method imputes each nonrespondent in stratum
$h$ with $Z = z_j$ by a random sample with replacement from respondents
in stratum $h$ with $Z = z_j$, where each $Y_{hi}$ with $Z_{hi} = z_j$ has probability
$w_{hi} I_{\{Z_{hi} = z_j\}} / \sum_{k=1}^{r_h} w_{hk} I_{\{Z_{hk} = z_j\}}$ of being selected, $i = 1, \ldots, r_h$. Problems of
these simple imputation methods are discussed in Section 1.

Using the MPELE estimators developed in Section 2, we consider the
following two imputation procedures:

1. Pseudo Likelihood Mean Imputation. For each nonrespondent in stratum
$h$ with $Z = z_j$, its imputed Y value is the mean estimator

$$\hat\mu_{hj} = \sum_{i=1}^{r_h} \hat p_{hi} f_h(Y_{hi}, z_j, \hat\beta) Y_{hi} \Big/ \sum_{i=1}^{r_h} \hat p_{hi} f_h(Y_{hi}, z_j, \hat\beta).$$

2. Pseudo Likelihood Random Imputation. Each nonrespondent in stratum
$h$ with $Z = z_j$ is imputed by a random sample with replacement from all
respondents in stratum $h$, where the probability of each $Y_{hi}$ being selected
is $f_h(Y_{hi}, z_j, \hat\beta)\hat p_{hi} / \sum_{k=1}^{r_h} f_h(Y_{hk}, z_j, \hat\beta)\hat p_{hk}$, $i = 1, \ldots, r_h$.
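
Both procedures use the same selection probabilities, proportional to $f_h(Y_{hi}, z_j, \hat\beta)\hat p_{hi}$; the mean imputation takes the expectation under them, and the random imputation draws from them. A minimal sketch for one stratum and one cell, with hypothetical values for `p_hat` and `f_j`:

```python
import random


def selection_probs(p_hat, f_j):
    """Normalized weights f_h(Y_hi, z_j, beta_hat) * p_hat_hi over respondents."""
    raw = [p * fj for p, fj in zip(p_hat, f_j)]
    total = sum(raw)
    return [x / total for x in raw]


def mean_impute(y_resp, p_hat, f_j):
    """Pseudo likelihood mean imputation: one common value for the cell."""
    probs = selection_probs(p_hat, f_j)
    return sum(q * y for q, y in zip(probs, y_resp))


def random_impute(y_resp, p_hat, f_j, n_missing, rng):
    """Pseudo likelihood random imputation: draws from all respondents in the stratum."""
    probs = selection_probs(p_hat, f_j)
    return rng.choices(y_resp, weights=probs, k=n_missing)


# Hypothetical stratum with three respondents and four nonrespondents in cell z_j.
y_resp = [1.0, 2.0, 3.0]
p_hat = [0.25, 0.25, 0.5]
f_j = [0.8, 0.4, 0.1]  # assumed values of f_h(Y_hi, z_j, beta_hat)

mean_value = mean_impute(y_resp, p_hat, f_j)
draws = random_impute(y_resp, p_hat, f_j, 4, random.Random(1))
print(mean_value, draws)
```

Unlike the simple methods, the donor pool is every respondent in the stratum, not only those observed in cell $z_j$.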
The following result shows that the estimators of $\bar Y$ and $\bar Y_j$ based on these
two imputation procedures are consistent and asymptotically normal.

Theorem 3. Under the conditions of Theorem 1, for either pseudo likelihood
mean imputation or pseudo likelihood random imputation,

$$\sqrt{n}\,(\hat{\bar Y}_I - \bar Y) \to_d N(0, \sigma_I^2) \quad\text{and}\quad \sqrt{n}\,(\hat{\bar Y}_{Ij} - \bar Y_j) \to_d N(0, \sigma_{Ij}^2), \qquad j = 1, \ldots, s,$$

where $\sigma_I^2$ and $\sigma_{Ij}^2$ are some positive constants.

The main difference between the simple (mean or random) imputation
method and the pseudo likelihood (mean or random) imputation method
is that the former restricts imputation within each Z category whereas the
latter uses all respondents with appropriate weighting. Hence, the latter
avoids the problems described in Section 1 for simple imputation and is
more efficient when our assumption on $P(Z = z_j \mid Y)$ holds.

The asymptotic variances $\sigma_I^2$ and $\sigma_{Ij}^2$ do not have simple analytic forms.
Variance estimation can be carried out using the bootstrap procedure described
in Section 3. It should be emphasized that, to address the variability
caused by imputation, nonrespondents in each bootstrap data set must be
imputed using the bootstrap data and the same imputation method as that
used to impute the original data set, as suggested by Shao and Sitter (1996).

5. Simulation results. In this section, we evaluate by simulation the finite
sample properties of the MPELE and the pseudo likelihood imputation.
We create a finite population similar to the Current Establishment Survey
conducted by the U.S. Bureau of Labor Statistics. We choose four different
industries as four strata with sizes $N_1 = 3370$, $N_2 = 2910$, $N_3 = 5430$ and
$N_4 = 4110$. The variable Y is the total pay for each establishment and values
of Y in stratum $h$ are generated from a superpopulation $F_h$. The form of $F_h$ is
chosen to be the gamma distribution and $F_1 = \Gamma(43, 0.20)$, $F_2 = \Gamma(42, 0.19)$,
$F_3 = \Gamma(38, 0.20)$ and $F_4 = \Gamma(50, 0.17)$, where $\Gamma(a, b)$ denotes the gamma distribution
with shape parameter $a$ and scale parameter $b$. The parameters in
the $F_h$ are chosen to match the mean and variance of a real data set from the
Current Establishment Survey. The covariate $Z \in \{1, 2, 3, 4, 5\}$ is generated
by the proportional-odds model

$$\log \frac{P(Z \le j \mid Y = y)}{P(Z > j \mid Y = y)}, \qquad j = 1, 2, 3, 4,$$

where $\beta$ is an unknown parameter whose value in the simulation is $-0.4$.
The sampling plan is stratified simple random sampling without replacement.
In each stratum, the sampling fraction is 0.03. For each sampled unit,
the Y respondent is generated according to the response probability function

$$P(\delta = 1 \mid Z = j) = \frac{\exp\{-0.1 + \gamma j\}}{1 + \exp\{-0.1 + \gamma j\}}$$

with an unknown parameter $\gamma$. The following table lists values of $\gamma$ considered
in the simulation, the response rate for each Z, and the mean response
rate $E[P(\delta = 1 \mid Z)]$:



$\gamma$                        0.7      0.5      0.3      0.1      -0.1
$P(\delta = 1 \mid Z = 1)$      0.6457   0.5978   0.5498   0.5000   0.4502
$P(\delta = 1 \mid Z = 2)$      0.7858   0.7109   0.6225   0.5250   0.4256
$P(\delta = 1 \mid Z = 3)$      0.8808   0.8022   0.6900   0.5498   0.4013
$P(\delta = 1 \mid Z = 4)$      0.9307   0.8699   0.7503   0.5744   0.3775
$P(\delta = 1 \mid Z = 5)$      0.9677   0.9168   0.8022   0.5987   0.3542
$E[P(\delta = 1 \mid Z)]$       0.8852   0.8224   0.7154   0.5634   0.3888


For each $\gamma$, we run the simulation 1000 times. Table 2 reports the variance
(Var) of the proposed MPELE estimators $\hat\beta$, $\hat{\bar Y}$ in (6), $\hat{\bar Y}_j$ in (7), and $\hat{\bar Y}_I$
in (14) and $\hat{\bar Y}_{Ij}$ in (15), based on either pseudo likelihood mean imputation
or pseudo likelihood random imputation. All the relative biases are less
than 0.3% and hence not reported. To compare the efficiency of the MPELE
estimators of the means (with imputation or without imputation) with the
simple estimators using the Z categories as imputation cells, we report the
ratios of mean square errors (Rat) in Table 2. Each MPELE is compared
with its counterpart; that is, $\hat{\bar Y}$ in (6) is compared with $\tilde{\bar Y}$ in (10), $\hat{\bar Y}_j$ in (7)
is compared with $\tilde{\bar Y}_j$ in (11), and $\hat{\bar Y}_I$ in (14) [or $\hat{\bar Y}_{Ij}$ in (15)] with pseudo
likelihood mean (or random) imputation is compared with $\hat{\bar Y}_I$ in (14) [or
$\hat{\bar Y}_{Ij}$ in (15)] with simple mean (or random) imputation described in Section
4. To study the performance of the bootstrap, Table 2 also reports the bootstrap
variance estimators (Vboot) with $B = 200$ for the MPELE estimators
and estimators based on pseudo likelihood mean or random imputation.
In addition, Table 2 reports the simulation coverage probabilities (CP) of
confidence intervals of the form

point estimate $\pm\ 1.96\sqrt{\text{Vboot}}$,

which approximately have nominal coverage probability 95%.
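
The interval and its simulated coverage can be sketched as follows. The point estimate 8.6 and the list of simulated intervals are hypothetical; only the value 0.0036 is taken from the Vboot column of Table 2.

```python
import math


def normal_interval(estimate, v_boot, z=1.96):
    """Point estimate +/- z * sqrt(bootstrap variance estimate)."""
    half_width = z * math.sqrt(v_boot)
    return estimate - half_width, estimate + half_width


def coverage(intervals, truth):
    """Share of intervals that contain the true parameter value."""
    hits = sum(1 for lo, hi in intervals if lo <= truth <= hi)
    return hits / len(intervals)


# Hypothetical point estimate with a bootstrap variance estimate of 0.0036.
lo, hi = normal_interval(8.6, 0.0036)

# Hypothetical simulation output: three intervals, true value 8.6.
cp = coverage([(8.5, 8.7), (8.61, 8.8), (8.4, 8.65)], 8.6)
print(lo, hi, cp)
```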
The results in Table 2 can be summarized as follows: 
1. In all cases, the proposed estimators based on the pseudo empirical likelihood
(with imputation or not) perform well in terms of the relative bias
(less than 0.3%) and variance. For the cell mean estimation, our proposed 
estimators are much more efficient than the simple estimators based on 
imputation cells. The ratio of the MSEs can be as small as 0.112 and is 
always less than 0.5 for estimators without imputation. For the overall 
mean estimation, our proposed estimators are still more efficient but the 
improvement is small. This is expected since "borrowing strength"
from other imputation cells has a larger impact for the cell mean estima- 
tion than for the overall mean estimation. 

2. When the response probability decreases, the variances of our proposed 
cell mean estimators increase, but their relative efficiencies to the simple 
estimators increase a great deal especially for the imputation methods. 
Note that the number of observations within a Z category increases as 
Z value increases, which results in a decrease in the gain in efficiency 
from our proposed cell mean estimators. These observations from Table 
2 indicate that our proposed cell mean estimators can improve the effi- 
ciency over the simple estimators particularly in the cases where some 
imputation cells have relatively small sample sizes and/or larger number 
of nonrespondents. 

3. In terms of the efficiency, the estimator without imputation is ranked 
the first, and the estimator with random imputation is ranked the last. 
However, imputation may be carried out because of practical reasons 
other than efficiency. 

4. The bootstrap variance estimator works well in most cases in terms of 
its bias and variance (simulation variances are less than 10""^ and not 
reported). The coverage probabilities of the confidence intervals are all 
around 95% with the worst case 91.7% ($\gamma = 0.3$, MPELE).



APPENDIX 
Regularity Conditions for Theorem 1. 

(i) For each /i, there are positive constants and such that ^ = 
kh + 0(1) and ^ = c/^ + o(n"^/^), where n = E^=i "^h- 

(ii) maxj<Ar^ n^l (Nqhi) = 0(1) and N^^"^ n-h/qhi ^ dh for some con- 
stant dh, h = 1, . . . ,H . 

(iii) $f_h(y, z, \beta)$ is twice continuously differentiable in $\beta$ for any $h$, $y$ and $z$,

anatunctions II q^q^ 1| , || 1| , || — g^g^ — 1| ancl||[ jx 

|-9/h(|^fei3) jT||2 bounded by a function g{Y) with Eh[g(Y)] < oo in a 
neighborhood of Po, the true value of /3, j,k = 1, . . . ,s, where is the 
expectation under F^. 



Table 2 

Simulation results based on 1000 simulation rounds and 200 bootstrap rounds 

Response pattern 
               γ = 0.7                    γ = 0.5                    γ = 0.3                    γ = 0.1                    γ = -0.1 
Method   Est.  Var    Vboot  CP   Rat     Var    Vboot  CP   Rat     Var    Vboot  CP   Rat     Var    Vboot  CP   Rat     Var    Vboot  CP   Rat 

         β     0.0001                     0.0001                     0.0001                     0.0001                     0.0001 

MPELE    Y     0.0036 0.0036 96.9 0.971   0.0038 0.0038 95.2 0.974   0.0042 0.0044 93.8 0.972   0.0054 0.0055 93.7 0.956   0.0079 0.0082 94.9 0.940 
         Y_1   0.0051 0.0045 92.4 0.114   0.0056 0.0049 93.2 0.112   0.0060 0.0056 92.7 0.115   0.0076 0.0069 92.8 0.120   0.0102 0.0096 93.4 0.143 
         Y_2   0.0045 0.0040 94.3 0.141   0.0049 0.0042 93.5 0.141   0.0053 0.0049 91.7 0.149   0.0065 0.0061 93.4 0.147   0.0091 0.0086 92.9 0.166 
         Y_3   0.0038 0.0037 94.2 0.239   0.0042 0.0039 94.3 0.239   0.0046 0.0045 92.9 0.241   0.0057 0.0056 94.0 0.226   0.0083 0.0081 93.2 0.252 
         Y_4   0.0038 0.0036 95.8 0.244   0.0039 0.0038 94.5 0.263   0.0044 0.0044 93.7 0.249   0.0058 0.0056 92.3 0.248   0.0084 0.0082 94.7 0.225 
         Y_5   0.0048 0.0046 95.7 0.410   0.0050 0.0049 95.2 0.424   0.0056 0.0056 94.0 0.461   0.0073 0.0072 93.5 0.423   0.0112 0.0112 92.9 0.377 

Mean     Y     0.0036 0.0036 96.9 0.970   0.0038 0.0038 95.2 0.973   0.0042 0.0043 93.4 0.967   0.0054 0.0055 93.7 0.951   0.0079 0.0080 95.3 0.938 
impu-    Y_1   0.0231 0.0229 95.9 0.493   0.0216 0.0223 95.3 0.443   0.0213 0.0213 94.3 0.396   0.0215 0.0216 92.7 0.352   0.0225 0.0230 94.1 0.341 
tation   Y_2   0.0204 0.0204 94.2 0.683   0.0191 0.0193 94.9 0.590   0.0180 0.0184 93.8 0.481   0.0186 0.0182 95.0 0.389   0.0188 0.0184 95.5 0.329 
         Y_3   0.0143 0.0142 95.7 0.831   0.0130 0.0134 96.1 0.741   0.0124 0.0129 93.6 0.613   0.0114 0.0124 94.6 0.465   0.0132 0.0134 94.5 0.381 
         Y_4   0.0134 0.0136 95.7 0.904   0.0137 0.0133 95.1 0.827   0.0121 0.0125 92.8 0.690   0.0124 0.0122 94.0 0.505   0.0128 0.0132 95.6 0.336 
         Y_5   0.0106 0.0109 95.5 0.966   0.0109 0.0108 94.6 0.913   0.0106 0.0109 95.2 0.812   0.0108 0.0113 93.4 0.660   0.0139 0.0141 94.1 0.470 

Random   Y     0.0038 0.0039 97.0 0.987   0.0043 0.0043 94.5 1.003   0.0050 0.0052 95.3 0.893   0.0066 0.0067 93.4 0.964   0.0097 0.0098 95.6 0.970 
impu-    Y_1   0.0315 0.0322 95.4 0.559   0.0324 0.0329 95.7 0.534   0.0322 0.0334 95.4 0.497   0.0363 0.0348 94.8 0.498   0.0364 0.0373 94.6 0.446 
tation   Y_2   0.0245 0.0249 94.5 0.736   0.0254 0.0255 93.9 0.682   0.0267 0.0264 95.3 0.541   0.0283 0.0281 94.0 0.507   0.0311 0.0302 95.5 0.460 
         Y_3   0.0159 0.0157 96.1 0.802   0.0156 0.0160 95.5 0.766   0.0164 0.0171 95.2 0.651   0.0176 0.0184 94.4 0.567   0.0218 0.0215 95.0 0.517 
         Y_4   0.0142 0.0145 95.8 0.920   0.0154 0.0150 94.8 0.805   0.0155 0.0159 93.4 0.744   0.0182 0.0175 94.3 0.627   0.0205 0.0210 95.3 0.438 
         Y_5   0.0109 0.0112 95.2 0.971   0.0118 0.0116 94.9 0.901   0.0124 0.0128 95.2 0.817   0.0148 0.0153 94.6 0.677   0.0204 0.0204 94.8 0.541 

The relative biases are all less than 0.3%. Var: variance of the estimators. Vboot: bootstrap variance estimator. CP: coverage probability 
in % of 95% confidence interval. Rat: ratio of the MSE to the MSE of simple estimator. 
(iv) For each $h$, there exists a $z_j$ such that is 
positive definite. 

(v) $\phi_h(z_j)$ is positive for any $h$ and $z_j$. 

(vi) \\y-^^^-^j^p^^\\'^ is bounded by an integrable function in a neighbor- 
hood of /3o for each j. 

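Proof of Theorem 1. For any function $f(\beta)$, we use the notation 
$f'(\beta) = \partial f(\beta)/\partial\beta$ and $f''(\beta) = \partial^2 f(\beta)/\partial\beta\,\partial\beta^T$. Let 
$B_n = \{\beta: \|\beta - \beta_0\| < n^{-1/3}\}$ and $\partial B_n = \{\beta: \|\beta - \beta_0\| = n^{-1/3}\}$. 
For the first conclusion in (8), it suffices to show that 

(16) $P(l(\beta) - l(\beta_0) < 0 \text{ for all } \beta \in \partial B_n) \to 1$. 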

The function $l(\beta)$ is equal to $\sum_{h=1}^H l_h(\beta)$ plus a term that does not depend 
on $\beta$, where 

rh 

h iP) = Whi log fh {Yhi ,Zhi,l3) 

i=l 

- V log ( 1 - V -j^^ fh(yhi,Zj,(3)\ 
i=i \ j=iWhnhj J 

and $W_h = \sum_{i=1}^{n_h} w_{hi} / \sum_{h=1}^H \sum_{i=1}^{n_h} w_{hi}$. Thus, it suffices to show that (16) holds 
with $l(\beta)$ replaced by $l_h(\beta)$ for each $h$. When $\beta \in \partial B_n$, 

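(17) $l_h(\beta) - l_h(\beta_0) = (\beta - \beta_0)^T l_h'(\beta_0) + \tfrac{1}{2}(\beta - \beta_0)^T l_h''(\beta^*)(\beta - \beta_0)$, 

where $\beta^*$ is between $\beta$ and $\beta_0$. Define $A_n = l_h'(\beta_0)/W_h$, 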
Cn,j = - [1 - Mzj)], and 

^ Ej=l fh{yhi,Zj,f3o)Cnj/Ej=l Mzj)fh{Yhi, Zj,(3o)]'^ 

l-J2j=lfh{yhi,Zj,l3o)Cn,j/J2j=l4'h{Zj)fh{Yhi,Zj,f3o)' 

Then An = A^n + A^n + A^^n + ^4n, where 



/I 1 fh{Yhi,Zhi,f3o) 



W, t[ ^'T.U^h{z,)UY^,,,z^,Poy 

A _ ST J_V^ f'hiyhi,Zj,/3o)fh{Yhi, Zj',Po) 

W.rr,, W, t EUMzj)MY,,,z„(3o) 
We finish the proof of (16) by the following four steps. 

Step 1. Show that $A_n \to_p 0$. Let $E_c$ denote the conditional expectation 
of $(Y,Z)$ given $\delta = 1$ in stratum $h$. Then 
$A_{1n} \to_p P_h(\delta = 1)E_c[f_h'(Y,Z,\beta_0)/f_h(Y,Z,\beta_0)] = \sum_{j=1}^s \phi_h(z_j)\pi_{hj}'(\beta_0)$, 
where $\pi_{hj}(\beta) = \int f_h(y, z_j, \beta)\,dF_h(y)$. Since $\phi_h(z_j)$ is positive, 
there exists a positive constant $\omega_0$ such that $\phi_h(z_j) > \omega_0$ for 
$h = 1, \ldots, H$ and $j = 1, \ldots, s$. Since $\omega_0 < \sum_{j=1}^s \phi_h(z_j) f_h(Y_{hi}, z_j, \beta_0) < 1$ and 
$|\sum_{j=1}^s f_h(Y_{hi}, z_j, \beta_0)c_{n,j}| \le \sum_{j=1}^s |c_{n,j}|$, we have $d_{n,i} = O_p(\sum_{j=1}^s |c_{n,j}|)$ 
uniformly in $i$. Also, $\sqrt{n_h}\,c_{n,j} \to_d N(0, \tau^2)$ for some $\tau^2$. Hence, 

A,n - op^2_^^ ^^^^^ h EUMzmy^^^^j^M) 

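Similarly, $A_{4n} = o_p(n_h^{-1/2}) = o_p(1)$. Finally, $A_{2n} \to_p \sum_{j=1}^s [1 - \phi_h(z_j)]\pi_{hj}'(\beta_0)$. 
Then $A_n \to_p 0$ follows from $\sum_{j=1}^s \phi_h(z_j)\pi_{hj}(\beta) + \sum_{j=1}^s [1 - \phi_h(z_j)]\pi_{hj}(\beta) = 1$. 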
Step 2. Show that $\sqrt{n_h}\,A_n \to_d N(0, \Sigma_h)$, where $\Sigma_h$ is $p \times p$ and $p$ is the 
dimension of /3. Write + A2n + ^Sn as 5(1^ T,7=i WhiH^hi, Po)), where 

= (^hi) 

I Jh{yhi, ^hi, Po) 

Tjy _ ^ l\ ^hif'h(yhi,Zj,Po) I 

Y.j=l<Ph(Zj)fhiyhi,Zj,(5o) ' 

Shifh{yhi,Zj,Po)fh{Yhi, Zji,Po) 

and 51 is defined as 

aiCmi ■ ■ ■ ,^s,Ci, ■ ■ ■ ■ ■ ■ ,(^s,rn, ■ ■ ■ ,ns, ■ ■ ■ ,Tsi, ■ ■ ■ ,Tss) 



j=i ^3 7=1 i'=i ^1 



5:^(l-0,(z,O)f^r,,, 



i=ij'=i 

with Cj'^jyTjj' being p-dimensional and being real numbers. By the 

central limit theorem, the $\delta$-method and $A_{4n} = o_p(n_h^{-1/2})$ (proved in step 1), 
$\sqrt{n_h}[A_n - g(E(\phi(x_{hi}, \beta_0)))] = \sqrt{n_h}[g(W_h^{-1}\sum_{i=1}^{n_h} w_{hi}\phi(x_{hi}, \beta_0)) - g(E(\phi(x_{hi}, \beta_0)))] + o_p(1) \to_d N(0, \Sigma_h)$, 
where $\Sigma_h$ is a $p \times p$ matrix. Since $A_n \to_p 0$, 
$g(E(\phi(x_{hi}, \beta_0))) = 0$ and the result follows. 
Step 3. Show that $D_n = l_h''(\beta^*)/W_h \to_p -U_h$, where $U_h$ is a positive definite 
matrix. Write $D_n = A_{5n} + A_{6n} + A_{7n}$, where 

1 



i=l 



A 



6n 



1 ^ E|.i^/u>^..,-.,/?*)]E|.i^/a^..,-.,/3*r 



^7„ = ^E-. 



Since 
Eh 



—j2whi{iogfh)"iYh^,Zhi,n -—J2^h^(^ogfh)"(Yhi,Zh^,Po: 



<—maxwhiEh max \\ {log fh)" {y, z, f3) - (log //,)"(?/, /3o)|| ^ 0, 



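$$A_{5n} \to_p \sum_{j=1}^s \phi_h(z_j)\pi_{hj}''(\beta_0) - \sum_{j=1}^s \int \frac{\phi_h(z_j) f_h'(y, z_j, \beta_0)[f_h'(y, z_j, \beta_0)]^T}{f_h(y, z_j, \beta_0)}\,dF_h(y).$$ 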



Under condition (iii), for $t = 1$ or 2, since $\omega_0 < \sum_{j=1}^s \phi_h(z_j) f_h(Y_{hi}, z_j, \beta_0) < 1$ 
and $|\sum_{j=1}^s f_h(Y_{hi}, z_j, \beta^*)c_{n,j}| \le \sum_{j=1}^s |c_{n,j}|$, we have 

1 



(18) 



r(l + Op(l)), 



(.J:j=l^h{zj)^{Yh^,ZJ,f3o)y 

uniformly in $i$. Then $A_{6n} \to_p \sum_{j=1}^s [1 - \phi_h(z_j)]\pi_{hj}''(\beta_0)$, and 



A 



-7n 'p 



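Hence, $D_n \to_p -U_h$, where 

$$U_h = \sum_{j=1}^s \int \frac{\phi_h(z_j) f_h'(y, z_j, \beta_0)[f_h'(y, z_j, \beta_0)]^T}{f_h(y, z_j, \beta_0)}\,dF_h(y) - \int \frac{[\sum_{j=1}^s \phi_h(z_j) f_h'(y, z_j, \beta_0)][\sum_{j=1}^s \phi_h(z_j) f_h'(y, z_j, \beta_0)]^T}{\sum_{j=1}^s \phi_h(z_j) f_h(y, z_j, \beta_0)}\,dF_h(y).$$ 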
By Cauchy's inequality and condition (iv), $U_h$ is positive definite. 

Step 4. Show that $P(l_h(\beta) - l_h(\beta_0) < 0 \text{ for all } \beta \in \partial B_n) \to 1$ for each $h$. 

When $\beta \in \partial B_n$, $\beta = \beta_0 + n^{-1/3}u$ with $\|u\| = 1$. Then by (17) and results in 
step 3, 



n 



1/3 



^u^UhU + Op{l) 



Let $\lambda$ be the smallest eigenvalue of $U_h$. Since $U_h$ is positive definite, $\lambda > 0$. 
Then 



n 



<\^>P 



w, 



1. 



The result follows since $u^T U_h u/2 \ge \lambda/2$. We now prove the second 
conclusion in (8). It follows from step 2 of the previous proof that $\sqrt{n}\,l'(\beta_0) =$ 

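$\sum_{h=1}^H W_h \sqrt{n}\,l_h'(\beta_0)/W_h \to_d N(0, \Sigma)$, where $\Sigma = \sum_{h=1}^H c_h^2 \Sigma_h/k_h$, and 
$k_h = \lim_{n\to\infty} n_h/n$ and $c_h = \lim_{n\to\infty} W_h$ are defined in condition (i). Let 
$U = \sum_{h=1}^H c_h U_h$. It follows from step 3 of the previous proof that 
$l''(\beta_0) = \sum_{h=1}^H W_h \times l_h''(\beta_0)/W_h \to_p -U$. By Taylor's expansion and the fact that $l'(\hat\beta) = 0$, 
$-l'(\beta_0) = (\hat\beta - \beta_0)^T l''(\beta_0) + o_p(\|\hat\beta - \beta_0\|)$. Then 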



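(19) $\hat\beta - \beta_0 = [-l''(\beta_0)]^{-1} l'(\beta_0) + o_p(\|\hat\beta - \beta_0\|)$. 

Hence $\hat\beta - \beta_0 = [-l''(\beta_0)]^{-1} l'(\beta_0)(1 + o_p(1)) = O_p(n^{-1/2})$. Then, by (19) and the 
result in step 2, 

$\sqrt{n}(\hat\beta - \beta_0) = [-l''(\beta_0)]^{-1}\sqrt{n}\,l'(\beta_0) + o_p(1) \to_d N(0, \Lambda)$, where 

$\Lambda = U^{-1}\Sigma U^{-1}$. 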
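
The chain of steps behind the limiting covariance of the estimator of $\beta$ can be written out as a short worked derivation. The display below is a summary sketch in the notation of the proof ($U$, $\Sigma$, $l'$, $l''$); it assumes $U$ is symmetric, which holds for the limit of a negative Hessian, and uses Slutsky's theorem for the last line.

```latex
\begin{align*}
0 = l'(\hat\beta)
  &= l'(\beta_0) + l''(\beta_0)(\hat\beta - \beta_0)
     + o_p(\|\hat\beta - \beta_0\|), \\
\sqrt{n}\,(\hat\beta - \beta_0)
  &= [-l''(\beta_0)]^{-1}\sqrt{n}\,l'(\beta_0) + o_p(1)
   = U^{-1}\sqrt{n}\,l'(\beta_0) + o_p(1), \\
\sqrt{n}\,l'(\beta_0) \to_d N(0, \Sigma)
  &\;\Longrightarrow\;
  \sqrt{n}\,(\hat\beta - \beta_0) \to_d
  N\bigl(0,\, U^{-1}\Sigma (U^{-1})^{T}\bigr)
  = N\bigl(0,\, U^{-1}\Sigma U^{-1}\bigr).
\end{align*}
```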

By (18), we can show that Ei=iPfei 1- Then J2h=iWhJ2lLiPhi 
Op{l). Let kh{P)=EZiPMYhi and th{P) = EZiPM- Then 



-1/2N 



Y -Y = Y - EY + Op(n-^/2) 

H 

= E Wh[kh0) - th0)EY]{l + Op{l)) + Opin 

h=l 
H 

= E Wh[kh{P^) - th{Po)EY + 0- poVik'hiPn - fhiP^EY)] 

h=l 

+ Op{n-^/^), 

where $\beta^*$ is between $\hat\beta$ and $\beta_0$. By (18), we can show that $k_h'(\beta^*) \to_p c_{h1}$ and 
$t_h'(\beta^*) \to_p c_{h2}$ for some constants $c_{hj}$. Let $c = \sum_{h=1}^H c_h(c_{h1} - c_{h2}EY)$. Then 



H 



V^{Y -Y) = Y,Wh 

h=l 



kh{Po)-thiPo)EY + 



c-U-\{Po) 



Wh 



+ Op{l). 
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Similarly to step 2 in the previous proof, we can write 



khm-thmEY. 



1 



1/2 



i=l 



for a function h and a vector ip. Let 



j_ 

-1/2. 



E{(ph), Vh = w^ J2i=i WhMxhi,Po) and Eh^p = E{iph). Recall that l'h{l^o)/Wh 



g{4'h) +Op{nf^ ) and g{Eh4>) = 0, where 4>{xhi,(io) and are defined in step 
2 of the previous proof. By the central limit theorem and the 5-method, 
n^[h{^h) + c^U-^g{^h) - h{Eh^)] ^d7V(0,cj2), where 



(20) 



al = [h'{Ehv),dU~^g'{Eh<P)] 

X [dhEh{^,4>){^,cpY - Eh[^,4>)Eh{^,4>y] 
X [h'{Eh^),eU-^g'{Eh(^)Y 
and dh is defined in condition (ii). Then 



H 



(21) Y.^h 



h=l 
,2 



where $\sigma^2 = \sum_{h=1}^H c_h^2\sigma_h^2/k_h$. It follows from (21) that 



(22) 



H 

h=l 



kh{Po)-thiPo)EY + 



H 



^ Chh{Eh(p). 



h=l 



0. 



But the left-hand side of (22) equals $\sum_{h=1}^H W_h\{E_hY - EY\} + o_p(1)$. 
Therefore, $\sum_{h=1}^H c_h h(E_h\psi) = 0$. Then, it follows from (21) and $W_h = c_h + 
o(n^{-1/2})$ that $\sqrt{n}(\hat{\bar Y} - \bar Y) \to_d N(0, \sigma^2)$. The proof for $\sqrt{n}(\hat{\bar Y}_j - \bar Y_j) \to_d N(0, \sigma_j^2)$ 
is similar. This shows (9) and completes the proof of Theorem 1. □ 

Lemma A.l. Assume the conditions of Theorem 1. Let Xhi = {Shi,Yhi, Zhi), 
i = 1, . . . ,nh and {x*/^^, . . . , x*f^^^} be a bootstrap sample. Assume that ip{x, (3) 
is continuous in j3 and ||^/;(x,/3)|p is bounded by an integrable function in 
a neighborhood of (5^; then ^J27=i'^hi'^i^hi^P) o-'^d ^YJi=iWhi'^{xhi,P) 
converge to Ehipi^Xhi, (3o) in probability and 



h h 



>d* iV(0,/), 



where <5"^j = Var^(y^n^-j^ t(;^j'(/;(x^j, /3)) and Var^ denotes the bootstrap 
variance conditional on the data in stratum h. Furthermore, 



, dhEhii'^Xhi, (3o)'il'ixhi, PoY] - Eh[ip{xhi, f3o)Eh'il'{xhi, Po) 
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which is the asymptotic variance of^j^J27=i''^hi'>P{xhi,f3o), where dh is de- 
fined in condition (ii) of Theorem 1. 

Proof of Theorem 2. We follow the proof of Theorem 1. Let fl* = 
{I3:\\(i-fi\\ < and = {/3 : ||/?-/3|| = n~^/'^}. When /? E dB^, there 

exists /3* between p and (3 such that /*(/?) - /*(/?) = (/3 - (Z^) + 

K/?-/3rEf=iC(/3'^)(/3-/3)-^ 

Step 1. Show that A* = 1^{(3)/Wh -^p 0. Let A*„ be the bootstrap ana- 
log of Ajn in the proof of Theorem 1. Then A* = A*„ + A^^ + + Al^. 
By Lemma A.l, AJ„ P/,((5 = l)EJ'f^{y,z,(3o)/fh{y,z,PQ) and 

E-.J(i - M^^nm = D^^s^JI;!^]- Let c;, = 5^ - (1 - 

(f)h{zj)). By Lemma A.l and the 5-method, y/nfiCnj = Op{l). Hence, the 
same argument used in the proof of Theorem 1 leads to A^^ = Op(l) and 

Step 2. Show that $\sqrt{n_h}(A_n^* - A_n) \to_{d^*} N(0, \Sigma_h)$. Using the notation in 
step 2 of the proof of Theorem 1, we have A* = Yh=i ^hi'^i^M^ (^)) + 
Op(.'>T'h^^'^)- By Lemma A.l, the (5-method and g' {^r^J2l=iWhi(f>{xhi, 
(^)y^l9'{wi: Sr=i '^hi X <j){xhi,P)) Sh, we have 



/ 1 . \ / 1 . 



Then the result follows from l'f^0)/Wh = g{j^ T,7=i WhiH^hi, ^)) + Op{nf^ ^^^). 

Step 3. Show that 1*^1' {j3*) /Wh — >p —Uh-, where Uh is defined in the proof of 
Theorem 1. Note that lt'm/Wh = Al^ + Al^ + A*^^. When ||/;;(y,z„/3)f 
is bounded by an integrable function Gj{y), maxj<„^ [[/^(y^^jZj,/?**)!! < 

maxi<„,^ \\f'f^{Yhi,Zj,l3**)\\ < maxi<„^ G]^^(Y^i) = Op{n]/^), where 13** is be- 
tween /3* and /3. For t = 1 or 2, similarly to (18) we have 

1 



(23) 



(i-E,^=i^A(n*,^„/3^))* 

h hj 



[l + Op(l)) 



ij:'j=,Mzj)fhiY,:„z„p)Y 

uniformly in i. The result follows from Lemma A.l and similar arguments 
to step 3 in the proof of Theorem 1. 

Step 4. Show that $P_*(l^*(\beta) - l^*(\hat\beta) < 0 \text{ for all } \beta \in \partial B_n^*) \to_p 1$. The proof is 
similar to step 4 in the proof of Theorem 1, using the results established in 
steps 1-3. 

This proves the first conclusion in (13). The proof of the second conclusion 
in (13) follows from Lemma 1 and the same argument as in the proof of 
Theorem 1. 

Let kl0*) and t^(/3*) be the bootstrap analogs of kh{(3) and th0), re- 
spectively, defined in the proof of Theorem 1. By Lemma A.l and (23), 

r* 

J2itiPhi = 1 + Op(l). A similar argument to the proof of Theorem 1 yields 

H 



h=l 



/ 1 . 



(24) 



/ 1 "-h •< 



+ Op{l) 



Note that 



H / 1 



h=l 

H 



h=l 



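Applying Lemma A.1 and the $\delta$-method to (24), we can show that 
$\sqrt{n}(\hat{\bar Y}^* - \hat{\bar Y}) \to_{d^*} N(0, \sigma^2)$. Similar arguments can show that 
$\sqrt{n}(\hat{\bar Y}_j^* - \hat{\bar Y}_j) \to_{d^*} N(0, \sigma_j^2)$. 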



Proof of Theorem 3. The proofs for the mean imputation estimators 
are similar to that of Theorem 1. Conditional on the sample, the mean of 
each random imputation estimator is equal to the corresponding mean 
imputation estimator. Then the results for the random imputation estimators 
follow from those for the mean imputation estimators and Lemma 1 of 
Schenker and Welsh (1988). □ 
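
The key fact used here, that a random imputation estimator averages out to the mean imputation estimator once the sample is held fixed, can be checked exactly on a toy example. The sketch below is an illustration under assumptions that are simpler than the paper's procedure: a single imputation cell, equal weights, and donors drawn uniformly with replacement from the respondents. The function names and the data are hypothetical, and inputs are integers so that `Fraction` arithmetic stays exact.

```python
from fractions import Fraction
from itertools import product


def mean_imputation_estimate(respondents, n_missing):
    """Cell mean after filling each missing value with the respondent mean."""
    cell_mean = Fraction(sum(respondents), len(respondents))
    total = sum(respondents) + n_missing * cell_mean
    return total / (len(respondents) + n_missing)


def expected_random_imputation_estimate(respondents, n_missing):
    """Exact conditional mean of the random imputation estimate.

    Enumerates every equally likely assignment of donors (drawn with
    replacement from the respondents) and averages the resulting cell means.
    """
    n = len(respondents) + n_missing
    estimates = [
        Fraction(sum(respondents) + sum(donors), n)
        for donors in product(respondents, repeat=n_missing)
    ]
    return sum(estimates) / len(estimates)


cell = [3, 5, 10]  # hypothetical respondent values in one imputation cell
print(mean_imputation_estimate(cell, 2))             # prints 6
print(expected_random_imputation_estimate(cell, 2))  # prints 6
```

Random imputation therefore only adds conditional variance around the mean imputation estimator, which is consistent with its last-place efficiency ranking in the simulation study.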